Coarse Graining for Synchronization in Directed Networks 
Coarse graining is a promising way to analyze and visualize large-scale networks. The
coarse-grained networks are required to preserve the statistical properties as well as the
dynamic behaviors of the initial networks. Several methods have been proposed and found
effective for undirected networks, whereas coarse graining of directed networks has received
little attention. In this paper, we propose a Path-based Coarse Graining (PCG) method to
coarse grain directed networks. By performing the linear stability analysis of synchronization
and numerical simulations of the Kuramoto model on four kinds of directed networks, including
tree networks and variants of Barabási-Albert networks, Watts-Strogatz networks and
Erdős-Rényi networks, we find that our method can effectively preserve the network
synchronizability.

PACS numbers: 89.75.Hc, 05.45.Xt, 89.75.Fb 
I. INTRODUCTION

Complex networks have become a key approach to understanding many social, biological,
chemical, physical and information systems, where nodes represent individuals and links
denote the relations or interactions between nodes. In this sense, to study the dynamics of
such systems is actually to investigate the dynamical behaviors on the networks. In
particular, network synchronization, an important emergent phenomenon of a population of
dynamically interacting units in various fields of science, has attracted much attention
[1]-[12]. Most works have focused on studying the relation between network topology and
synchronization [?]-[7] and on enhancing the synchronizability by designing weighting
strategies [8]-[13]. Moreover, some efforts have been made to study synchronization in
directed networks [13]-[17]. It has been pointed out that the optimal structure for
synchronizability is a directed tree [?] and that the convergence time is strongly related
to the depth of the tree [11], [?]. Most experiments investigating the dynamic behaviors are
implemented on small networks. However, when a network contains a very large number of
nodes, it sometimes becomes impossible to model the dynamic process. For example, to
investigate synchronization, extending the coupled differential equation model of a single
node to such a large system is too complicated to be carried out.

A promising way to address this problem is to coarse grain the network, namely to reduce
the network complexity by mapping the large network into a smaller one. Coarse graining
techniques have been successfully applied to model large genetic networks [18] and to
extract the slowest motions in protein networks [19]. Essentially, the coarse graining
process is very similar to the problem of cluster finding or community detection in networks
(see Refs. [20]-[23] for some popular methods). The coarse-grained network is obtained by
merging the nodes in the same cluster or community. However, coarse graining goes far beyond
clustering, since it requires the coarse-grained network to keep the same topological
properties or dynamic behaviors as the initial network, such as the degree distribution,
clustering coefficient and assortativity correlation [24], the properties of random walks on
the network [25], synchronization [26] and critical phenomena [27]. Most former works on
coarse graining consider undirected networks. However, in many real systems, the
interactions between individuals are not reciprocal. For example, food webs, gene regulation
systems, metabolic systems and neural systems are usually represented by directed networks,
where the nodes are affected by their upstream nodes. In directed networks we can of course
ignore link directions and apply methods developed for undirected networks, but this
approach discards potentially useful information contained in the link directions and may
dramatically change the key organizational features when coarse graining the networks [28].
In addition, some prominent methods may run into problems when applied to directed networks.
Among the existing coarse graining methods for undirected networks, the spectral coarse
graining (SCG) method is a very general one that can be applied to many dynamic processes
such as synchronization, random walks and epidemic spreading [?], [26]. In order to preserve
a particular eigenvalue, the SCG method merges the nodes with similar elements in the
corresponding eigenvector. Different eigenvalues should be considered for different dynamic
processes, so the choice of eigenvectors is problem dependent. As the synchronizability is
mainly related to the largest and the smallest nonzero eigenvalues, the SCG method for
synchronization takes $p_2$ and $p_N$ into consideration ($p_2$ and $p_N$ are respectively
the eigenvectors for the smallest nonzero and the largest eigenvalue). However, this method
may not perform well in directed networks, since the eigenvector elements cannot
successfully characterize the nodes' dynamic roles. For instance, nodes in different layers
may have exactly the same eigenvector elements in directed acyclic networks, while nodes
with exactly the same topology may have totally different eigenvector elements in directed
networks with cycles. In short, designing an effective coarse graining method for directed
networks is still challenging.

In this paper, we propose a Path-based Coarse Graining (PCG) method to coarse grain
directed networks for synchronization. The basic idea is that nodes that receive the same
impacts from other nodes are similar to each other, and thus can be merged. The impacts that
one node receives from other nodes are calculated by tracing the origin of the source in the
directed network (i.e., along the opposite direction of links). It has been pointed out that
the dynamical correlation can be predicted from such topological similarity [29]. Therefore,
coarse graining in this way will most naturally merge the nodes according to their
functional performance and is likely to preserve the dynamical properties. The linear
stability analysis of synchronization and numerical simulations of the Kuramoto model on
four kinds of directed networks, including tree networks and variants of Barabási-Albert
networks, Watts-Strogatz networks and Erdős-Rényi networks, show that our method can
effectively preserve the synchronizability of the initial directed networks. Additionally,
we find that the far sources play more important roles when identifying the nodes' roles in
directed networks with an obvious hierarchical structure, while the near sources are more
important in directed networks with many loops.

II. PATH-BASED COARSE GRAINING (PCG) 
METHOD 

A. Definition of node's impact-vector 

Many structure-based similarity indices have been proposed to quantify the similarity
between nodes [30]-[32], most of which only work for undirected networks. How to define
node similarity in directed networks is still a challenge. Here we propose a method that
traces the origin of impacts in a directed network. The basic assumption is that two nodes
are structurally similar if they obtain the same impacts from other nodes, and thus they are
more likely to be merged during the coarse graining process. Consider a directed network
G(V, E), where V and E denote the sets of nodes and directed links, respectively. Multiple
links and self-connections are not allowed. The impact of node x on node y is defined by
summing over the collection of directed paths from x to y, with weights that depend
exponentially on the path length. The mathematical expression reads

$$ f_{x \to y} = \sum_{l=1}^{l_{\max}} \beta^{l} \, |\mathrm{path}^{l}_{x \to y}|, \qquad (1) $$

where $\mathrm{path}^{l}_{x \to y}$ is the set of all directed paths of length l from node
x to node y, and $|\mathrm{path}^{l}_{x \to y}|$ is its size. Mathematically,
$|\mathrm{path}^{l}_{x \to y}| = (A^{l})_{xy}$, where A is the adjacency matrix: if x points
to y then $A_{xy} = 1$, otherwise $A_{xy} = 0$. $\beta$ is a free parameter that controls
the weights of the paths. A smaller $\beta$ assigns more weight to the short paths, and vice
versa. Only the paths whose lengths are not larger than $l_{\max}$ are considered. If
$l_{\max} = \infty$, namely if all directed paths from x to y are considered, Eq. (1) is
similar to the Katz index [33]. However, there are two significant differences. On the one
hand, the adjacency matrix in the Katz index is symmetric, while it is asymmetric in
Eq. (1). On the other hand, the parameter $\beta$ is usually smaller than unity in the Katz
index, which assigns more weight to the short paths, while in Eq. (1) $\beta$ has no such
limitation. Since counting all paths between every pair of nodes is very time consuming,
especially in large networks, we set $l_{\max}$ equal to the length of the longest path
among all the shortest paths between two nodes. Note that when $l_{\max} = \infty$ and
$\beta$ is smaller than the reciprocal of the largest eigenvalue of A (which ensures
convergence), the impact matrix F with elements $f_{x \to y}$ defined in Eq. (1) can be
calculated directly by $F = (I - \beta A)^{-1} - I$.
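
As an illustration, the truncated sum in Eq. (1) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation used for the results: the function names (mat_mul, impact_matrix, impact_vector) and the four-node toy network are hypothetical, and the matrices are plain nested lists, which is only practical for small networks.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def impact_matrix(adj, beta, l_max):
    """Return F with F[x][y] = sum_{l=1}^{l_max} beta**l * (A**l)[x][y], as in Eq. (1)."""
    n = len(adj)
    power = [row[:] for row in adj]          # holds A^l, starting with l = 1
    impact = [[0.0] * n for _ in range(n)]
    weight = beta                            # holds beta^l
    for _ in range(l_max):
        for x in range(n):
            for y in range(n):
                impact[x][y] += weight * power[x][y]
        power = mat_mul(power, adj)          # move on to A^(l+1)
        weight *= beta
    return impact


def impact_vector(impact, x):
    """Column x of F: the impacts that node x receives from every node."""
    return [row[x] for row in impact]


# Hypothetical toy network: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3 (A[x][y] = 1 if x points to y).
A = [[0, 1, 1, 0],
     [0, 0, 0, 1],
     [0, 0, 0, 1],
     [0, 0, 0, 0]]
F = impact_matrix(A, beta=2.0, l_max=2)
```

In this toy network, nodes 1 and 2 receive identical impacts (only from node 0), so their impact vectors coincide and they are natural candidates for merging.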

B. Group partition via k-means clustering

We assign to each node x an N-dimensional impact-vector equal to the xth column of the
matrix F, namely

$$ f_x = (f_{1x}, f_{2x}, f_{3x}, \ldots, f_{Nx})^{T}, $$

where N = |V| is the number of nodes. Clearly, if two nodes receive the same impacts from
their ancestors (i.e., upstream nodes), they tend to have the same phase in synchronization,
and thus are more likely to be merged during coarse graining. Suppose we are going to coarse
grain a network containing N nodes into a smaller one with K nodes. We adopt the k-means
clustering method [34] to partition the N nodes into K groups. The nodes in the same group
are merged. The k-means clustering technique aims at minimizing the within-cluster sum of
squares:

$$ E = \sum_{i=1}^{K} \sum_{x \in V(i)} \| f_x - c(i) \|^{2}, \qquad (2) $$

where V(i) is the set of nodes in cluster i (i = 1, 2, ..., K), and c(i) is the centroid of
cluster i, which is equal to the mean of the points in cluster i, namely

$$ c(i) = \frac{1}{|V(i)|} \sum_{x \in V(i)} f_x. \qquad (3) $$

The detailed steps of k-means clustering are as follows: (i) Choose K vectors as the initial
centroids of the clusters. (ii) Randomly choose a node x from the set V. This node belongs
to cluster i if the distance between its vector $f_x$ and the centroid c(i) of cluster i is
the minimum among the centroids of all K clusters. (iii) Update the centroid of each cluster
according to Eq. (3). (iv) Repeat steps (ii) and (iii) until the centroids no longer change.
Note that for a given K, the clusters depend heavily on the initial configuration of the set
of centroids, which makes the interpretation of the clusters rather uncertain. Different
initializations may lead to different solutions, which may be trapped in local minima.
Clusters should be, as much as possible, compact, well separated, and interpretable,
possibly with the help of some additional variables, such as the F-statistic. Here we only
focus on whether the clusters are compact, namely whether the vectors (nodes) within one
cluster are close (similar) enough, and neglect whether the clusters are well separated.
Therefore we finally choose the clustering result with the lowest E among L possible
solutions (we set L = 20 in this paper).
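
The selection of the most compact partition among several restarts can be sketched as follows. This is a hedged sketch rather than the authors' code: it uses the standard batch (Lloyd) iteration, in which all nodes are reassigned before the centroids are updated, instead of the one-node-at-a-time variant described in step (ii); the function names and the choice of sampling the initial centroids from the data are assumptions.

```python
import random


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans_once(vectors, k, rng, max_iter=100):
    """One k-means run; returns (labels, E) with E the sum of squares of Eq. (2)."""
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda i: sq_dist(v, centroids[i]))
                      for v in vectors]
        if new_labels == labels:              # assignments are stable: converged
            break
        labels = new_labels
        for i in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == i]
            if members:                       # an empty cluster keeps its old centroid
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    energy = sum(sq_dist(v, centroids[lab]) for v, lab in zip(vectors, labels))
    return labels, energy


def kmeans_best(vectors, k, restarts=20, seed=0):
    """Keep the run with the lowest E among `restarts` random initializations."""
    rng = random.Random(seed)
    return min((kmeans_once(vectors, k, rng) for _ in range(restarts)),
               key=lambda result: result[1])


# Hypothetical two-dimensional example with two obvious groups.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
labels, energy = kmeans_best(points, k=2)
```

In practice the impact-vectors $f_x$ would take the place of the toy points, and restarts = 20 corresponds to the value L = 20 quoted above.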

C. Weighting strategy for the coarse-grained 
networks 

Another crucial problem in the process of coarse graining is how to update the links'
weights after merging the nodes, so that the resulting network is truly representative of
the initial one. An effective weighting strategy was proposed by Gfeller et al. [26]. Here
we apply it to directed networks. Specifically, when we merge the nodes in cluster i to form
a new node labeled by $m_i$, the weights of the merged links are updated according to the
following principle:

$$ W_{x \to m_i} = \frac{\sum_{y \in V(i)} W_{x \to y}}{|V(i)|}, \quad \text{for } m_i\text{'s in-links}, $$
$$ W_{m_i \to x} = \sum_{y \in V(i)} W_{y \to x}, \quad \text{for } m_i\text{'s out-links}, \qquad (4) $$

where $W_{x \to y}$ indicates the weight of the directed link from x to y, which can also be
interpreted as the coupling strength. A simple illustration is shown in Fig. 1. The initial
network, shown in Fig. 1(a), is constituted of seven nodes and eight directed links. Assume
that the initial links' weights are all equal to unity. After merging the three nodes a, b
and c, a new node m is generated, and according to Eq. (4) the link weights in the reduced
network are as drawn in Fig. 1(b). Indeed, since the three nodes in total receive three
in-links from node d and two from node e, the weights of m's two in-links are respectively
$W_{d \to m} = 3/3 = 1$ and $W_{e \to m} = 2/3$. For m's out-links, since the three nodes
have two out-links to node f and one to node g, the weights are respectively
$W_{m \to f} = 1 + 1 = 2$ and $W_{m \to g} = 1$.
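
The worked example above can be reproduced with a short sketch of Eq. (4). The function name merge_cluster and the dictionary representation are hypothetical; the exact wiring between {d, e} and {a, b, c} is assumed (only the link counts are given in the text), and links internal to the merged cluster are simply dropped in this sketch, a case the text does not specify.

```python
from collections import defaultdict


def merge_cluster(weights, cluster, merged="m"):
    """Apply Eq. (4) to a dict {(source, target): weight}, merging `cluster` into one node."""
    cluster = set(cluster)
    reduced = defaultdict(float)
    for (x, y), w in weights.items():
        if x in cluster and y in cluster:
            continue                                  # assumption: internal links are dropped
        if y in cluster:
            reduced[(x, merged)] += w / len(cluster)  # in-link: sum divided by |V(i)|
        elif x in cluster:
            reduced[(merged, y)] += w                 # out-link: plain sum
        else:
            reduced[(x, y)] += w                      # link not touching the cluster
    return dict(reduced)


# Assumed wiring consistent with the counts in the text: d sends three links and e two links
# into {a, b, c}; two links leave the cluster toward f and one toward g. All weights are 1.
links = {("d", "a"): 1.0, ("d", "b"): 1.0, ("d", "c"): 1.0,
         ("e", "a"): 1.0, ("e", "b"): 1.0,
         ("a", "f"): 1.0, ("b", "f"): 1.0, ("c", "g"): 1.0}
reduced = merge_cluster(links, {"a", "b", "c"})
```

With this wiring the reduced network has the four weighted links quoted in the text: d to m with weight 1, e to m with weight 2/3, m to f with weight 2, and m to g with weight 1.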

FIG. 1: (Color online) A simple illustration of how to update the links' weights in the
coarse graining process. (a) The initial network, constituted of seven nodes and eight
directed links. (b) The reduced network after merging the three nodes a, b and c. Numbers on
the links indicate the new weights of the links.

Under the framework of master stability analysis, the synchronizability of an undirected
network can be quantified by the ratio between the largest and the smallest nonzero
eigenvalues of the Laplacian matrix of the network, namely $R = \lambda_N / \lambda_2$,
where $\lambda_N$ and $\lambda_2$ are respectively the largest and the smallest nonzero
eigenvalues of the Laplacian matrix [35]-[37]. In directed networks, since the Laplacian
matrix, defined as $L_{ij} = k_i^{\mathrm{in}} \delta_{ij} - a_{ij}$, is asymmetric with
zero row sum, it has complex eigenvalues. In order to achieve the synchronization condition,
every eigenvalue must be entirely contained in the region of negative Lyapunov exponent for
the particular master stability function. Once the stability zone is bounded and the
imaginary parts of the complex eigenvalues are small enough, the network synchronizability
can be approximately measured by the ratio of real parts $R = \lambda_N^r / \lambda_2^r$,
where $\lambda_N^r$ and $\lambda_2^r$ are respectively the largest and the second smallest
real parts of the eigenvalues [11], [38]. Generally speaking, the stronger the
synchronizability, the smaller the ratio R. Note that an accurate index for measuring the
synchronizability of directed networks has not yet been proposed and calls for further
study. Here, we use the approximate index $R = \lambda_N^r / \lambda_2^r$ as an indicator of
whether the synchronizability of a directed network changes after coarse graining. Usually
$\lambda_N^r$ is proportional to the largest degree $k_{\max}$ (i.e., the largest node
strength in a weighted network) and $\lambda_2^r$ corresponds to the lowest degree
$k_{\min}$ (i.e., the lowest node strength in a weighted network). Therefore, keeping
$k_{\max}$ and $k_{\min}$ unchanged can effectively help to maintain the synchronizability
after coarse graining. Thus, in the coarse graining process, the nodes with the largest and
smallest in-degrees can only be merged if the $k_{\max}$ and $k_{\min}$ of the coarse-grained
network are respectively equal to those of the initial network. Otherwise, we randomly
select two nodes, one with the largest in-degree and the other with the smallest in-degree,
before group partition. Then the remaining N - 2 nodes are classified into K - 2 groups by
k-means clustering. Note that, unless stated otherwise, k always refers to the in-degree. In
the appendix, we further discuss the effect of the constraint of keeping $k_{\max}$ and
$k_{\min}$ on the coarse graining results. It shows that the eigenvalue ratio R is sensitive
to $k_{\max}$ and $k_{\min}$, while the order parameter of the Kuramoto model is not.
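
For concreteness, the indicator $R = \lambda_N^r / \lambda_2^r$ can be evaluated numerically as sketched below. This is only a sketch under assumptions: it relies on NumPy's numpy.linalg.eigvals, the function name eigenratio is hypothetical, the adjacency matrix follows the convention of Eq. (1) (A[x][y] is the weight of the link from x to y) and is transposed so that each Laplacian row collects in-links, and the network is assumed to have a single zero eigenvalue so that the second smallest real part is nonzero.

```python
import numpy as np


def eigenratio(adj):
    """Ratio of the largest to the second smallest real part of the Laplacian eigenvalues."""
    a = np.asarray(adj, dtype=float)
    in_weights = a.T                                   # row i now lists the links pointing to i
    laplacian = np.diag(in_weights.sum(axis=1)) - in_weights   # in-strength on the diagonal
    real_parts = np.sort(np.linalg.eigvals(laplacian).real)
    return real_parts[-1] / real_parts[1]


# Directed star (a tree of depth one): node 0 points to nodes 1, 2 and 3.
star = [[0, 1, 1, 1],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]
R_star = eigenratio(star)

# Undirected path 0 - 1 - 2, written as reciprocal directed links.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
R_path = eigenratio(path)
```

The star has Laplacian eigenvalues 0, 1, 1, 1 and hence R = 1, while the undirected path has eigenvalues 0, 1, 3 and hence R = 3.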

Finally, regarding computational complexity, the k-means clustering algorithm is of
$O(N^{?})$, and the time complexity of calculating the impact matrix F is $O(N^{?})$.
Likewise, the calculation of eigenvectors in the SCG method also takes $O(N^{?})$. However,
with the development of computing techniques, many fast algorithms can help to reduce the
computational complexity and enable our method to deal with large networks. For example, the
computational complexity of the Katz index (i.e., the case $\beta < 1$) can be reduced to
O(N + M), where M is the number of edges in the network [40].



III. RESULTS 

A. Coarse graining on modeled networks 

We apply the Path-based Coarse Graining (PCG) approach to four kinds of directed networks:
(i) Directed tree network. A tree with N nodes and L layers is generated starting from a
directed train of length L, in which each node represents a layer. Then the remaining N - L
nodes are added one by one. Each newly added node is connected by a directed link starting
from one of its ancestors that is not located in layer L. (ii) A variant of Barabási-Albert
networks [41]: directed BA network. An acyclic directed BA network is generated by using the
mechanism for the undirected BA network and assuming that links can only point from older
nodes to younger nodes. (iii) A variant of Watts-Strogatz networks [42]: directed WS
network. The model starts from a completely regular network with identical degree and
clockwise links. Each link is rewired between two randomly selected nodes with probability
q (q ∈ (0, 1)). (iv) A variant of Erdős-Rényi networks [43]: directed ER random network.
Directed ER random networks can be generated by setting q = 1.
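
One possible reading of the directed tree construction in (i) is sketched below. The interpretation is an assumption: each new node picks its parent uniformly at random among all existing nodes outside the last layer, so that the tree never grows deeper than L layers. The function name directed_tree and the node labeling are hypothetical.

```python
import random


def directed_tree(n, depth, seed=0):
    """Return (edges, layer) for a directed tree with n nodes and `depth` layers."""
    rng = random.Random(seed)
    # Initial directed train: node i sits in layer i + 1 and points to node i + 1.
    layer = {i: i + 1 for i in range(depth)}
    edges = [(i, i + 1) for i in range(depth - 1)]
    for new in range(depth, n):
        # Candidate parents: every existing node that is not in the last layer.
        parent = rng.choice([v for v in layer if layer[v] < depth])
        edges.append((parent, new))
        layer[new] = layer[parent] + 1
    return edges, layer


tree_edges, tree_layer = directed_tree(n=50, depth=5)
```

Every node except the root receives exactly one in-link, so the network has n - 1 links and all nonzero in-degrees equal one.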

First, we investigate the performance of PCG on the above four kinds of networks. The
synchronizability R of the coarse-grained network in the ($\beta$, K) plane is shown in
Fig. 2, where K is the size of the coarse-grained network. Interestingly, we find that in
the tree network and the acyclic BA network a larger $\beta$ on average provides better
results than a smaller one. In particular, in the tree network there is an obvious line at
$\beta \approx 1$. In the BA network, with $\beta > 1$ the coarse-grained network keeps the
synchronizability exactly the same as that of the initial network. In the cyclic WS network
the $\beta$ that best preserves the synchronizability is around 0.1. It seems that networks
with more loops tend to obtain better coarse graining with smaller $\beta$ (see subsection C
for a detailed discussion of the relationship between the optimal parameter $\beta^*$ and
the number of loops in the network). The result in Fig. 2(d) shows that the
synchronizability of the coarse-grained ER network is not sensitive to $\beta$ regardless of
K, since the total fluctuation is smaller than 0.07.

FIG. 2: (Color online) The synchronizability R in the ($\beta$, K) plane for (a) directed
tree networks (N = 1000, L = 20), (b) directed BA network (N = 1000, k = 3), (c) directed WS
network (N = 1000, k = 10, q = 0.1) and (d) directed ER network (N = 1000, k = 10).

We compare the PCG method with two other methods, namely Random Coarse Graining (RCG) and
Spectral Coarse Graining (SCG). In RCG, the N elements of each node's vector are randomly
selected in the range (0, 1). Then the nodes are classified into K groups by k-means
clustering. In directed networks, the eigenvalues and eigenvectors of the Laplacian matrices
are complex. When we apply SCG to directed networks, we consider only the real parts in this
paper. In practice, we define I equally distributed intervals between the maximum and
minimum of $p_2$ ($p_N$), where $p_2$ and $p_N$ are the eigenvectors corresponding to the
second smallest and the largest real-part eigenvalues of the Laplacian matrix, respectively.
The nodes whose eigenvector components in $p_2$ ($p_N$) fall in the same interval are
merged. Specifically, if the elements in both $p_2$ and $p_N$ are identical, we randomly
divide the nodes into K groups. Actually, the relation between I and K strongly depends on
the network structure (i.e., the distribution of the elements in $p_2$ and $p_N$). For
instance, considering the initial WS and ER networks shown in Fig. 2, when I = 800 the size
of the coarse-grained WS network is 951, while the reduced ER network only contains 281
clusters. Note that there are many potential ways to apply the SCG method to directed
networks that make use of the imaginary parts of the eigenvector elements. For example, the
vectors $p_2$ ($p_N$) can be generated by combining the real part $p^r$ and the imaginary
part $p^i$ (such as $\sqrt{(p^r)^2 + (p^i)^2}$, $p^r + p^i$, $p^r p^i$, etc.). Another way
is to group the nodes according to four vectors, namely
$(p_2^r, p_2^i, p_N^r, p_N^i)$. If the imaginary parts are appropriately considered, the
performance of SCG can be improved. However, finding the right way to make use of the
imaginary parts is a tough problem, and involving them inappropriately in the SCG method may
lead to even worse results.
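
The interval-based grouping used for SCG can be sketched as follows. This is an assumption-laden sketch: the function name interval_groups is hypothetical, the inputs are taken to be the real parts of $p_2$ and $p_N$ already computed elsewhere, and two nodes are merged only if they share the same interval for both vectors, which is one reading of the rule stated above.

```python
def interval_groups(p2, pn, n_intervals):
    """Group node indices whose components in p2 and pn fall in the same pair of intervals."""
    def bin_index(value, low, high):
        if high == low:                       # all components identical: a single interval
            return 0
        idx = int((value - low) / (high - low) * n_intervals)
        return min(idx, n_intervals - 1)      # the maximum belongs to the last interval

    groups = {}
    for node, (a, b) in enumerate(zip(p2, pn)):
        key = (bin_index(a, min(p2), max(p2)), bin_index(b, min(pn), max(pn)))
        groups.setdefault(key, []).append(node)
    return list(groups.values())


# Hypothetical eigenvector components for four nodes, with I = 4 intervals.
groups = interval_groups([0.0, 0.05, 0.5, 1.0], [1.0, 0.95, 0.5, 0.0], n_intervals=4)
```

Here nodes 0 and 1 share both intervals and are merged, which illustrates why the number of resulting clusters K depends on how the components are distributed and not only on I.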

Figure 3 shows how the indicator R changes with K on the above four kinds of directed
networks with typical values of $\beta$. Overall, PCG outperforms SCG, and RCG performs
worst. As shown in Fig. 3(a), (b) and (c), with SCG and RCG the synchronizability changes
even when only a few nodes have been merged, while PCG displays a large stable range. In a
directed tree network, most of the elements in the eigenvectors corresponding to the
smallest and the largest eigenvalues are identical. Thus it is impossible to distinguish the
roles of nodes by analyzing the eigenvectors as suggested in Ref. [26]. In tree networks,
the most effective coarse graining strategy is to merge the nodes in the same layer. PCG can
indeed identify the nodes in different layers well by using a larger parameter
$\beta$ (> 1). In this sense, PCG is very effective in acyclic directed tree networks. From
Fig. 3(a), one can see that when K > L, PCG keeps the synchronizability exactly the same as
that of the initial network. When K = L, the tree is reduced to a train of length L, namely
all the nodes in the same layer are merged. When K < L, there is a sudden jump of R; see the
inset of Fig. 3(a). This is caused by merging nodes in different layers, which leads to a
smaller $k_{\min}$ according to the weighting strategy in Eq. (4). In this case, if we
artificially set $k_{\min}$ of the reduced network equal to that of the initial network, the
synchronizability can be well preserved (exactly equal to 1). A similar phenomenon exists in
the acyclic BA network, where the hierarchical structure is clear.

It has been demonstrated that the synchronizability of the directed BA network with average
in-degree k = 3 is exactly 3 [11]. Figure 3(b) shows that PCG with parameter $\beta = 5$ can
guarantee R = 3 by keeping the network acyclic and $k_{\max}$ and $k_{\min}$ unchanged, even
when the network is reduced to 30 clusters (i.e., K = 30). When K < 30, merging may generate
some loops and decrease $k_{\min}$, and thus lead to a sharp increase of R. This increase
cannot be perfectly avoided by artificially keeping $k_{\min}$ as we did in the tree
network; doing so only brings R back to around 3, since here the loops also play a role. On
the contrary, the SCG method may induce loops even when merging only a few nodes (i.e., for
a larger K). For example, when K = 600, the synchronizability of the reduced SCG network is
R = 3.77, while the synchronizability of the reduced PCG network is exactly equal to 3.

In networks with cycles, including directed WS networks and directed ER networks, there is
no clear hierarchical structure. The local information (i.e., short paths) therefore plays a
more important role in quantifying a node's impact during the coarse graining process, and a
relatively small $\beta$ is required. Here we use $\beta = 0.1$. The results show that the
PCG method performs as well as the SCG method in directed ER networks and much better than
SCG and RCG in directed WS networks.

FIG. 3: (Color online) The evolution of the ratio $R = \lambda_N^r / \lambda_2^r$ as a
function of the size K of the coarse-grained network, for RCG, SCG and PCG. The initial
networks are the same as the ones in Fig. 2. We use the typical parameter $\beta = 5$ for
(a) directed tree networks and (b) directed BA networks, and $\beta = 0.1$ for (c) directed
WS networks and (d) directed ER networks. The results for RCG and PCG are obtained by
averaging over 100 independent network realizations. Insets show the results for
K ∈ [2, 100].

In addition, we point out that grouping the nodes with the aim of preserving the dynamics
cannot maintain the topological properties at the same time, although the grouping is based
on topological similarity. Generally, the average degree of the coarse-grained network is
larger than that of the initial network. For comparison, we generate a group of modeled
networks that have the same topological properties as the initial network and the same size
as the coarse-grained network. It is shown that the average number of reachable nodes and
the number of loops of the coarse-grained networks are always higher than those of the
modeled networks, while the average shortest distance of the coarse-grained networks is
always smaller than that of the modeled networks. Moreover, the coarse graining procedure
may change the degree distribution of the initial networks. However, the topological
properties of the PCG networks are relatively closer to those of the initial networks than
the SCG networks are, especially in acyclic networks (the difference is less obvious in
directed networks with cycles). For example, the PCG method can prevent the production of
loops and keep the coarse-grained networks partially reachable.



B. Kuramoto model on coarse-grained networks 

FIG. 4: (Color online) Given a specific directed tree network with N = 100 nodes and L = 10
layers as shown in (a), the coarse-grained networks obtained through PCG and SCG are
presented in (b) and (c), respectively; each consists of K = 10 clusters. Panel (d) shows
the performance of the Kuramoto model on these three networks, namely (a) the original
network, (b) the PCG network and (c) the SCG network. Panels (e) and (f) show respectively
the results for the WS network (N = 100, k = 4, q = 0.1) and the BA network (N = 100,
k = 3). Their coarse-grained networks all contain K = 25 clusters. The coupling strength is
$\sigma = 10$ and $\omega_i$ is randomly selected in the range (-0.5, 0.5). Initially,
$\theta_i$ is randomly chosen in $(-\pi, \pi)$.

FIG. 5: (Color online) (a) The dependence of the number of loops with different lengths on
the reshuffling steps in a directed BA network (N = 100, k = 5). (b) $\beta^*$ as a function
of the reshuffling steps. Each point is obtained by averaging over 100 independent network
realizations.

Since the Laplacian matrices of directed networks are asymmetric, the eigenvalues
$\lambda_2$ and $\lambda_N$ are complex. In this case, the indicator R cannot exactly
represent the synchronizability of a network. Hence, we further test our method with the
Kuramoto model [3], [?], which is a classical model for investigating phase synchronization
phenomena. The coupled Kuramoto model on the network can be written as

$$ \dot{\theta}_i = \omega_i + \sigma \sum_{j=1}^{N} A_{ji} \sin(\theta_j - \theta_i), \qquad i = 1, 2, \ldots, N, \qquad (5) $$

where $\omega_i$ and $\theta_i$ are respectively the natural frequency and the phase of
oscillator i, $\sigma$ is the coupling strength, and A is the adjacency matrix. The
collective dynamics of the whole population is measured by the macroscopic complex order
parameter

$$ r(t) e^{i \phi(t)} = \frac{1}{N} \sum_{j=1}^{N} e^{i \theta_j(t)}, \qquad (6) $$

where the modulus r(t) ∈ [0, 1] measures the phase coherence of the population and
$\phi(t)$ is the average phase. $r(t) \approx 1$ and $r(t) \approx 0$ describe the limits in
which all oscillators are respectively phase locked and moving incoherently. By studying the
behavior of the order parameter r(t), we are able to investigate whether the topology of the
coarse-grained network is representative of the initial one. The initial network is a tree
network as shown in Fig. 4(a), which contains 100 nodes and has 10 layers. After the PCG
procedure, we obtain a train-like network with depth equal to 10; see Fig. 4(b). With the
SCG method, a cyclic network is generated, as shown in Fig. 4(c). How the order parameter
r(t) of the Kuramoto model behaves on these three networks is shown in Fig. 4(d). It is
obvious that r(t) of the PCG network converges at almost the same speed as that of the
initial network, while in the SCG network it converges faster. Moreover, the results for the
directed WS network and BA network are shown in Fig. 4(e) and (f), respectively. Their
coarse-grained networks all contain 25 clusters. It is clear that the PCG method preserves
the synchronizability more effectively than SCG.
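
A minimal simulation of the Kuramoto dynamics and of the order parameter r(t) can be sketched as follows. The sketch makes several assumptions that are not stated in the text: a simple explicit Euler scheme with a fixed time step, link weights multiplying the sine term on weighted (coarse-grained) networks, and an in-link list representation. The function names and the five-node directed chain are hypothetical; the coupling strength and the ranges of the natural frequencies and initial phases follow the values quoted in the caption of Fig. 4.

```python
import cmath
import math
import random


def order_parameter(phases):
    """Modulus r of the complex order parameter (1/N) * sum_j exp(i * theta_j)."""
    return abs(sum(cmath.exp(1j * th) for th in phases) / len(phases))


def simulate(in_links, sigma=10.0, dt=0.01, steps=2000, seed=0):
    """Euler integration of the Kuramoto model; in_links[i] lists (j, weight) for links j -> i."""
    rng = random.Random(seed)
    n = len(in_links)
    omega = [rng.uniform(-0.5, 0.5) for _ in range(n)]          # natural frequencies
    theta = [rng.uniform(-math.pi, math.pi) for _ in range(n)]  # initial phases
    history = [order_parameter(theta)]
    for _ in range(steps):
        rate = [omega[i] + sigma * sum(w * math.sin(theta[j] - theta[i])
                                       for j, w in in_links[i])
                for i in range(n)]
        theta = [th + dt * r for th, r in zip(theta, rate)]
        history.append(order_parameter(theta))
    return history


# Hypothetical directed chain 0 -> 1 -> 2 -> 3 -> 4 with unit weights.
chain = [[], [(0, 1.0)], [(1, 1.0)], [(2, 1.0)], [(3, 1.0)]]
r_history = simulate(chain)
```

Comparing the r(t) curves of an initial network and of its coarse-grained versions, as in Fig. 4(d), then only requires calling simulate on each in-link list with the same parameters.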



C. The optimal parameter /3* for different networks 

In different networks, the optimal parameters p* cor- 
responding to the best performance on coarse graining 
are different. Empirically, the /3* of acyclic networks is 
larger than that of those containing loops. To investi- 
gate whether /3* is affected by the cycles in networks, we 
carry out an experiment based on directed BA networks, 
on which loops are generated by reshuffling some links. 
Specifically, we randomly select two directed links from 
the network, for example, one is from node A to B and 
the other is from node C to D. Then we rewired these 
two links as A to D and C to B. In this way, the degree 
of these nodes will not be changed during the reshuffling 
procedure. In average, reshuffling more links leads to 
more loops, see an example in Fig. [SJa) where the num- 
bers of loops with length 3, 4 and 5 all increase with the 
increasing of reshuffling steps. Now, we would like to find 
the optimal parameters for the reshuffled networks. For a 
given network, /3* might be different with different K as 
we have shown in Fig. [51 However, in practice, checking 
the optimal parameter for different K in advance is some- 
times impossible. Thus, we here ignore the relationship 
between j3 and K , and consider the general performances 
of one parameter on the coarse-grained networks with the 
possible sizes we concerned. The j3* is thus correspond- 
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ing to the /3 that yields the minimum synchronizabihty 
difference between the coarse-grained networks and the 
initial networks, which can be mathematically expressed 
by: 

N 

d=J2\RK~Ro\ (7) 

K=n 

where R_K is the synchronizability of the coarse-grained
network with K nodes, R_0 is the synchronizability of the
initial network, and n is the minimum size of the
coarse-grained network considered. Since too small a K may
lead to a dramatic change of R, we choose n = 10 in the
example shown in Fig. 5. We take β* to be the β that
minimizes d. The dependence of β* on the number of
reshuffling steps is shown in Fig. [SJb). Instead of
considering all possible β, which is very time consuming,
we test β in the range [0.01, 10] with steps of 0.01, 0.1
and 1 in [0.01, 0.1), [0.1, 1) and [1, 10], respectively.
Clearly, β* decreases as the number of reshuffling steps
increases. Indeed, if a directed network has an obvious
hierarchical structure and few loops, PCG performs better
with a relatively large β, since a large β emphasizes the
long paths that reveal the hierarchical structure. In
directed networks with many loops, however, the
hierarchical structure is not clear. Since a path that
passes through a loop can be regarded as an infinitely long
path, its effect on the impact vector is enormously
amplified by a large β, which introduces noise when
characterizing the dynamical role of a node. In this case,
it is better to pay more attention to the impacts from the
local structure, namely to emphasize the effects of short
paths by using a small β.
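The parameter scan described above can be sketched as follows. This is a minimal illustration, not the authors' code: the callable `R_of(beta, K)`, which is assumed to return the synchronizability of the coarse-grained network with K nodes obtained with a given β, is hypothetical, as are all function names.

```python
import numpy as np

def beta_grid():
    """Candidate beta values: step 0.01 on [0.01, 0.1),
    step 0.1 on [0.1, 1) and step 1 on [1, 10]."""
    fine = np.arange(1, 10) * 0.01    # 0.01, ..., 0.09
    medium = np.arange(1, 10) * 0.1   # 0.1, ..., 0.9
    coarse = np.arange(1, 11) * 1.0   # 1, ..., 10
    # Rounding removes floating-point noise such as 0.30000000000000004.
    return np.round(np.concatenate([fine, medium, coarse]), 2)

def sync_difference(R_by_K, R0, n=10):
    """Eq. (7): sum of |R_K - R_0| over all sizes K >= n.
    R_by_K maps a coarse-grained size K to its synchronizability R_K."""
    return sum(abs(R_K - R0) for K, R_K in R_by_K.items() if K >= n)

def best_beta(R_of, R0, sizes, n=10):
    """Return (beta*, d) for the grid value with the smallest d.
    R_of(beta, K) is a hypothetical, user-supplied callable."""
    scores = {}
    for beta in beta_grid():
        R_by_K = {K: R_of(beta, K) for K in sizes}
        scores[float(beta)] = sync_difference(R_by_K, R0, n)
    beta_star = min(scores, key=scores.get)
    return beta_star, scores[beta_star]
```

Building the grid from integer ranges rather than a floating-point step keeps the half-open intervals exact, so the scan visits 28 values in total.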



IV. CONCLUSION

Coarse graining is an effective way to analyze and
visualize large networks. Many methods and models have been
proposed to reduce the size of networks while preserving
their main properties, such as the degree distribution,
clustering coefficient and degree correlation, as well as
dynamic behaviors such as random walks, synchronizability
and critical phenomena. However, most of these works
consider undirected networks, while the coarse graining of
directed networks has received little attention. In this
paper, we introduce a Path-based Coarse Graining (PCG)
method, which assumes that two nodes are structurally
similar if they receive the same impacts from other nodes,
and are thus more likely to be merged during the coarse
graining process. The impacts that a node receives from
other nodes are calculated by tracing the origin of impacts
in the directed network. Specifically, the impact of node x
on node y is defined by summing over the collection of
directed paths from x to y with weights that are
exponential in the path length and controlled by a
parameter β. A larger β indicates that long paths are more
important (i.e., more weight is assigned to long paths).
Numerical analysis on four kinds of directed networks,
including tree-like networks and variants of
Barabasi-Albert networks, Watts-Strogatz networks and
Erdos-Renyi networks, shows that our method can effectively
preserve the synchronizability during the coarse graining
process. This result is further demonstrated by the
Kuramoto model. In addition, we find that long paths play a
more important role in coarse graining tree-like networks,
while in cyclic networks the long paths that involve loops
usually have negative effects on quantifying the impact of
one node on the others during the coarse graining process,
so a smaller β gives better performance.

Finally, we note that the idea of merging nodes that
receive the same impacts from the network is quite general
for coarse graining directed networks. For example, for
random walks, two nodes in a directed network that have
exactly the same upstream neighbors should be grouped
together, since their random-walker probabilities come from
the same sources. In this sense, coarse graining directed
networks for other dynamics would be an interesting
extension.


Appendix A: PCG method without degree 
constraint 

In the main text, we assumed that the nodes with the
largest and smallest in-degrees can only be merged if the
k_max and k_min of the coarse-grained network remain equal
to those of the initial network. To investigate the effect
of keeping the maximum and minimum in-degree on the coarse
graining result, we remove the constraint on k_max and
k_min in PCG and examine the performance of the modified
PCG method. Since we mainly consider synchronization, the
indicator R is sensitive to k_max and k_min; see Fig. 6.
PCG with the constraint clearly performs better than PCG
without it. However, as shown in Fig. 7, the order
parameter r(t) of the Kuramoto model does not show obvious
differences. Moreover, we compared RCG with and without the
in-degree constraint. The result shows that the degree
constraint does not markedly improve the performance of
RCG.
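As a reading aid for the comparison of r(t), the following sketch computes the order parameter from a set of phases. It assumes the standard Kuramoto definition r = |(1/N) sum_j exp(i θ_j)|; the paper's own definition lies outside this appendix, and the function name is illustrative.

```python
import numpy as np

def order_parameter(theta):
    """Standard Kuramoto order parameter along the last axis.
    theta has shape (N,) for one snapshot or (T, N) for a time series."""
    theta = np.asarray(theta, dtype=float)
    return np.abs(np.exp(1j * theta).mean(axis=-1))
```

With this definition, r is close to 1 when all phases coincide and close to 0 when the phases are spread evenly around the circle.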
FIG. 6: (Color online) Comparison of the PCG with and without
the constraint of k
All the parameters in this figure are the same as those in
Fig. 2.